Abstract

Background: Apple tree breeding is slow and difficult due to long generation times, self-incompatibility, and 
complex genetics. The identification of molecular markers linked to traits of interest is a way to expedite the 
breeding process. In the present study, we aimed to identify genes whose steady-state transcript abundance was 
associated with inheritance of specific traits segregating in an apple (Malus × domestica) rootstock F1 breeding 
population, including resistance to powdery mildew (Podosphaera leucotricha) disease and woolly apple aphid 
(Eriosoma lanigerum). 

Results: Transcription profiling was performed for 48 individual F1 apple trees from a cross of two highly 
heterozygous parents, using RNA isolated from healthy, actively-growing shoot tips and a custom apple DNA 
oligonucleotide microarray representing 26,000 unique transcripts. Genome-wide expression profiles were not clear 
indicators of powdery mildew or woolly apple aphid resistance phenotype. However, standard differential gene 
expression analysis between phenotypic groups of trees revealed relatively small sets of genes with trait-associated 
expression levels. For example, thirty genes were identified that were differentially expressed between trees resistant 
and susceptible to powdery mildew. Interestingly, the genes encoding twenty-four of these transcripts were 
physically clustered on chromosome 12. Similarly, seven genes were identified that were differentially expressed 
between trees resistant and susceptible to woolly apple aphid, and the genes encoding five of these transcripts 
were also clustered, this time on chromosome 17. In each case, the gene clusters were in the vicinity of previously 
identified major quantitative trait loci for the corresponding trait. Similar results were obtained for a series of 
molecular traits. Several of the differentially expressed genes were used to develop DNA polymorphism markers 
linked to powdery mildew disease and woolly apple aphid resistance. 

Conclusions: Gene expression profiling and trait-associated transcript analysis using an apple F1 population readily 
identified genes physically linked to powdery mildew disease resistance and woolly apple aphid resistance loci. This 
result was especially useful in apple, where extreme levels of heterozygosity make the development of reliable DNA 
markers quite difficult. The results suggest that this approach could prove effective in crops with complicated 
genetics, or for which few genomic information resources are available. 
Background

The advent of large-scale transcription profiling technologies, such as DNA
microarrays [1] and RNAseq technology [2], has allowed the analysis of gene
expression patterns at the genome level. DNA microarrays have been used for
genetic mapping studies based on polymorphisms between parental genotypes [1].
When used to analyze genetically segregating populations, DNA microarrays have
facilitated the discovery of gene expression markers [2]. Gene expression
markers can be defined as transcripts encoded by genes whose relative messenger
RNA expression level is inherited and segregates as a phenotypic trait [3].

Apple tree breeding is a slow and difficult process for reasons that include
long juvenility periods, large size of mature plants, inbreeding depression,
reproductive self-incompatibility, and complex phenotypes related to grafted
trees [4]. The use of molecular markers for marker-assisted breeding and
selection has the potential to expedite this process by increasing the
percentage of desired genotypes and associated phenotypes early on in the
breeding pipeline and assisting breeders in combining desirable traits from
different parents into breeding progenies [5,6]. In the present study, our
objective was to identify genes whose steady-state expression level in healthy,
uninfected apple shoot tips correlated with the inheritance of agriculturally
important traits in an apple rootstock breeding population. These transcripts
would have the potential to be used as molecular markers by themselves, or
could be used to develop DNA polymorphism markers for marker-assisted selection
and gene mapping in the population.

Genetics in apple is typically done in the F1 generation due to the
self-incompatibility of apple [7]. The apple rootstock population used for this
study was an F1 population from a cross of two highly heterozygous rootstock
parents, 'Ottawa 3' (O3) and 'Robusta 5' (R5). This is a phenotypically diverse
and well-characterized population that is segregating for numerous traits of
interest to apple growers. The segregating traits include resistance to biotic
stresses such as fire blight (Erwinia amylovora; [8]) and powdery mildew (PM,
Podosphaera leucotricha; [9]) diseases and resistance to the woolly apple aphid
(WAA; Eriosoma lanigerum; [8,10]) pest.

We applied DNA microarray transcription profiling to 48 individual F1 trees
from the O3 × R5 cross and identified transcripts with expression levels
associated with PM disease and WAA pest resistance phenotypes. When the genes
encoding these transcripts were mapped to the apple genome, they were found to
be physically clustered. This is similar to the phenomenon described for single
nucleotide polymorphisms and gene expression markers in Brassica napus [11].
The utility of using physically clustered, differentially expressed genes for
DNA marker development is discussed.



Results 

Microarrays 

RNA was isolated from healthy, uninfected, uninfested shoot tips collected from
48 individual F1 trees from the O3 × R5 cross population growing in a rootstock
production stool bed in Geneva, NY. The RNA was used to probe 24 microarrays in
multiplex format, using two different color probes per array, so that all 48
RNA samples could be assayed on the 24 microarrays. The microarrays were
clustered based on their expression profiles using the hierarchical clustering
function in R (complete linkage; Figure 1). The array clustering groups did not
consistently correspond with either PM resistance or WAA resistance, with
closely clustered arrays often including F1 individuals with contrasting
phenotypes for PM and WAA resistance (Figure 1).
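
The clustering step can be illustrated with a minimal sketch, written here in Python rather than R. It assumes Euclidean distance between expression profiles (the distance measure is not stated above) and uses a naive agglomerative loop; the function names and the toy profiles are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles):
    """Naive agglomerative clustering with complete linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are tuples of row indices into `profiles`.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: cluster distance is the farthest member pair.
            d = max(euclidean(profiles[i], profiles[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical two-gene profiles for four arrays (illustration only).
toy_profiles = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
for left, right, dist in complete_linkage(toy_profiles):
    print(left, right, round(dist, 2))
```

Overlaying the phenotype of each array on the resulting merge order is then enough to see whether resistant and susceptible trees fall into separate branches.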

Physical clustering of differentially expressed genes 

A group of F1 individuals resistant to PM and a group of F1 individuals
susceptible to PM were selected (Figure 1). Using standard differential
expression analysis [12], thirty transcripts that were differentially expressed
between the PM-resistant and PM-susceptible phenotype groups were identified
(q-value < 0.05; Table 1; Additional file 1: Table S1). The physical locations
of the genes encoding these transcripts on the ~742 Megabase (Mb),
17-chromosome apple genome [7] were determined using the BLAST [13] server on
the Genome Database for Rosaceae [14]. Twenty-four of the genes were located on
chromosome 12 (Figure 2a), with nineteen being clustered within a 10 Mb region
centered on the major, previously identified PM quantitative trait locus (QTL)
segregating in the O3 × R5 F1 breeding population (Figures 3a and 4).
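
As a rough illustration of how unlikely this concentration would be by chance, the sketch below computes a binomial tail probability for 24 of 30 genes landing on one given chromosome. It assumes every gene is equally likely to fall on any of the 17 chromosomes, which is a simplification (chromosomes differ in size and gene content) and is not a test performed in this study; the function name is hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts from the text: 24 of 30 PM-associated genes on chromosome 12.
# Assumption: equal probability (1/17) of a gene lying on any chromosome.
p_chance = binomial_tail(24, 30, 1 / 17)
print(f"P(>= 24 of 30 genes on one given chromosome) = {p_chance:.2e}")
```

A more careful version would weight each chromosome by the number of array transcripts that map to it.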

Similarly, a group of F1 individuals resistant to WAA and a group of F1
individuals susceptible to WAA [8,10] were selected (Figure 1). Using standard
differential expression analysis, seven transcripts that were differentially
expressed between the WAA-resistant and WAA-susceptible groups were identified
(Table 1; Additional file 1: Table S1). The genes encoding five of these
transcripts lay on chromosome 17 (Figure 2b), all within the top 9 Mb of
chromosome 17 (Figure 3b), and three of these were within about 1 Mb of a
major, previously identified WAA resistance QTL (Figures 3b and 4).
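
Statements such as "within about 1 Mb of a QTL" reduce to simple interval counts once gene positions are known. The sketch below shows one way to count genes near a locus and to find the densest window on a chromosome; the positions and the QTL coordinate are invented for illustration and are not the mapped values from this study.

```python
def genes_near(positions_mb, center_mb, half_width_mb):
    """Return the gene positions lying within half_width_mb of center_mb."""
    return [p for p in positions_mb if abs(p - center_mb) <= half_width_mb]


def densest_window(positions_mb, width_mb):
    """Return (window_start, gene_count) for the fullest window of width_mb.

    Windows are anchored at each gene position in turn (two-pointer scan).
    """
    pts = sorted(positions_mb)
    best_start, best_count = None, 0
    j = 0
    for i, start in enumerate(pts):
        while j < len(pts) and pts[j] <= start + width_mb:
            j += 1
        if j - i > best_count:
            best_start, best_count = start, j - i
    return best_start, best_count


# Hypothetical gene positions (Mb) and QTL location on one chromosome.
example_positions = [0.8, 1.4, 1.9, 4.2, 8.7]
example_qtl_mb = 1.5

print(genes_near(example_positions, example_qtl_mb, 1.0))
print(densest_window(example_positions, 2.0))
```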

Additionally, we performed similar analyses using gene expression markers [3]
with clear segregation patterns in the apple F1 population as molecular traits
(Additional file 2: Figure S1). Trees were separated into phenotypic groups
based on the expression level of the gene expression marker, with one group
containing plants with high expression levels of the gene expression marker,
and the other group containing plants with low expression levels of the gene
expression marker. Then, genes that were
Figure 1 Clustering of the 48 arrays based on their overall similarity in gene expression patterns. The tree represented by each array is 
indicated by tree identification number. The powdery mildew and woolly apple aphid resistance phenotype for each tree is indicated. 



differentially expressed between the two phenotypic groups were identified by
standard differential expression analysis. In all cases examined, genes that
were differentially expressed between phenotype groups based on gene expression
markers were disproportionately located on single chromosomes (Figure 2c-f, and
Additional file 1: Table S1), often clustering in the physical vicinity of the
gene expression marker gene used to define the phenotypic groups (Figure 3c-e),
or occasionally clustering at a separate location (Figure 3f; and Additional
file 1: Table S1).

The expression levels of the physically clustered, differentially expressed
genes had a mixture of positive and negative associations with their associated
trait (Figures 4 & 5).
Table 1 Differentially expressed genes for powdery mildew and woolly apple aphid resistance identified by q-value 
analysis 
*R-S = resistant - susceptible. 



For example, some of the physically clustered genes with expression levels
associated with PM resistance had higher expression levels in the resistant
trees (positive phenotype association), while others had lower expression
levels in the resistant trees (negative phenotype association), and the
expression levels of intervening genes had no phenotype association (Figures 4
& 5). In some instances, differentially expressed genes with positive and
negative phenotype associations were in juxtaposition (Figures 4 & 5) with as
little as 90 kb distance between them (Additional file 1:
Figure 2 Physical clustering of differentially expressed genes at the genome level. Genes that were differentially expressed between 
phenotypic groups of trees were mapped to each of the seventeen apple chromosomes, as indicated. Phenotypic groups were developed based 
on resistance to powdery mildew disease (a), resistance to woolly apple aphid (b) and several gene expression markers (GEMs, c-f). 
Figure 3 Physical clustering of differentially expressed genes at the chromosome level. Genes that were differentially expressed between 
phenotypic groups of trees were mapped on to the chromosome where they were most abundant for that trait, as indicated. The distribution 
along single chromosomes of genes that were differentially expressed between phenotypic groups of trees based on resistance to powdery 
mildew disease (a), resistance to woolly apple aphid (b), and four different gene expression markers (GEMs, c-f) are shown. Asterisk indicates 
chromosome segment containing the major, previously identified QTL for the trait (a, b) or containing the gene expression marker gene used as 
the molecular trait (c-e). 



Table S1). Gene expression marker genes with expression patterns not associated
with the PM resistance phenotype were visible within the PM QTL region
(Figure 5).

It is notable that the genes with trait-associated expression levels did not
necessarily have the same expression pattern in all individuals in a phenotype
group. For example, PM resistance-associated genes APPLE0F000001606,
APPLE0FR00048809, and APPLE0F000052120 are visibly quite consistent in their
expression within a phenotypic group, while APPLE0F000026140 and
APPLE0F000002331 are less consistent within each phenotypic group (Figure 5).

Validation of gene expression level heritability and 
consistency 

The heritability of expression level of selected genes with trait-associated
expression levels was validated by quantitative polymerase chain reaction
(qPCR) analysis using the O3 and R5 parents and a group of F1 individuals in
the O3 × R5 F1 population growing in a location different from the trees used
for DNA microarray analysis and sampled during a different year. qPCR analysis
showed that PM resistance-associated gene APPLE0FR00048809 had higher
expression in parent R5 compared to parent O3 (Additional file 2: Figure S2),
just as predicted by the microarray experiment. Furthermore, expression of
APPLE0FR00048809 among the O3 × R5 F1 population used for qPCR analysis
segregated at a 1:1 ratio (Additional file 2: Figure S2), just as predicted by
the DNA microarray data. In addition, a gene expression marker that had a
distinctly bimodal expression in the DNA microarray analysis
(APPLE0F000001974) also had bimodal expression in the O3 × R5 F1 population
used for qPCR validation; this bimodal expression segregation was visible among
PCR amplicons (Additional file 2: Figure S3). Finally, the relative gene
expression level relationships between APPLE0FR00068101 and several genes with
associated expression patterns and which exhibited continuous expression level
distribution in the array data (APPLE00R00024612, APPLE0F000011491,
APPLE0F000050102) were maintained in the qPCR relative expression data
(Additional file 1: Tables S2 and S3).
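
A 1:1 segregation ratio of this kind is commonly checked with a chi-square goodness-of-fit test. The sketch below is one such check using only the Python standard library; it is offered as an illustration rather than as the procedure used here, and the high/low counts are hypothetical (chosen only to sum to the 46 trees sampled for qPCR).

```python
import math


def chi_square_1to1(n_high, n_low):
    """Chi-square goodness-of-fit test (1 df) against a 1:1 ratio.

    Returns (chi2, p_value).
    """
    expected = (n_high + n_low) / 2
    chi2 = ((n_high - expected) ** 2 + (n_low - expected) ** 2) / expected
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of high- and low-expressing trees.
chi2, p = chi_square_1to1(24, 22)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A large p-value means the observed counts give no reason to reject the 1:1 expectation.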

Development of DNA markers based on genes with 
trait-associated expression level 

As a proof of concept, DNA markers were developed for 
WAA resistance-associated gene APPLE0FR00068101. 
Figure 4 Expression patterns of physically clustered differentially expressed genes. For each trait, a subset of all the differentially expressed 
genes was located within a 10 Megabase (Mb) window centered on the physical location of the trait of interest in the apple genome. For 
powdery mildew resistance, a 1.6 Mb expanded window shows details, including transcript identifier numbers. Differentially expressed genes 
correlating with powdery mildew and woolly apple aphid resistance included some having higher expression (positive correlation) and some 
having lower expression (negative correlation) in resistant trees. Similarly, differentially expressed genes correlating with gene expression markers 
included some having higher expression (positive correlation) and some having lower expression (negative correlation) in trees where the gene 
expression marker gene expression level was high. Chr, chromosome. 



The sequence of APPLE0FR00068101 was used to identify Malus × domestica contig
MDC015568.236 (6,831 bp long), which contained the best matching DNA sequence
according to BLAST analysis. Several polymorphisms were identified within this
region when comparing the O3 haplo-contigs with the R5 haplo-contigs. Of
particular interest for easy marker development were two microsatellite regions
between bases 2,500 and 3,500 of contig MDC015568.236, for which PCR primer
pairs were designed (Waa68101-236ssr, forward primer 5'-GGGTTGAAGTGCGAGAC-3',
reverse primer 5'-CACGCGACGAGGTATTCCAAC-3'; and Waa68101-236Indel, forward
primer 5'-CCAAATTATGCATACAGATG-3', reverse primer 5'-GATTAATGATTAGAAGAAC-3')
and tested with parent DNA with annealing temperature gradient PCR (Additional
file 2: Figure S4). Both markers were polymorphic between the parents, but only
the Waa68101-236ssr was heterozygous in the R5 parent, showing bands at
approximately 360 bp, 460 bp and 520 bp (Additional file 2: Figure S4).
Segregation analysis in the O3 × R5 population showed very strong association
of the Waa68101-236ssr 360 bp band with resistance (p = 0.0001). The
Waa68101-236ssr marker was more predictive of WAA resistance in the O3 × R5
population than the published interval containing the R5-derived Er2 gene
delineated between SSR markers GD96 (MDC021359.285 at 11,796 kb on Chr17) and
GD153 (MDC013709.214 at 9,138 kb on Chr17) [7,10].
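
The association between a marker band and a resistance phenotype can be assessed from a 2 × 2 table of band presence against phenotype. The text does not state which test produced the reported p-value, so the sketch below uses Fisher's exact test purely as an illustration; the function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Hypothetical counts: rows = band present / absent,
# columns = resistant / susceptible.
p_assoc = fisher_exact_two_sided(20, 2, 3, 21)
print(f"p = {p_assoc:.2e}")
```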

Discussion 

The results of array clustering for the 48 microarrays indicated that overall
gene expression patterns of individual plants were not robust indicators of PM
or WAA resistance phenotype. In contrast, the differential gene expression
analysis based on phenotypic groups of F1 individuals yielded relatively small
numbers of genes that were differentially expressed between the phenotypic
groups, and these differentially expressed genes displayed a remarkable degree
of physical clustering on the apple genome. The clustered genes were typically
in the physical vicinity of the major locus controlling the trait, in the case
of PM and WAA resistance, or in the vicinity of the gene expression marker used
as the molecular trait. Clusters at locations unlinked to their corresponding
trait of interest (Figure 3f; and Additional file 1: Table S1) might represent
the locations of other QTLs related to the phenotype. All of the phenotypes
examined in this study were controlled by single, major, dominant QTL, which
allowed detection of linkage using

Figure 5 Expression heat map of genes in the area of the powdery mildew
resistance QTL. Genes are arranged in their linear order along the chromosome,
and trees are divided into groups according to powdery mildew resistance
phenotype. Each column of colored blocks represents gene expression readings
from one individual tree. Data for all genes queried by the microarray lying
between positions 28.7-30.9 Mb of chromosome 12 are shown. Green blocks
indicate trees where the expression of a given gene was lower than the average
for that gene across all 48 trees; red blocks indicate plants where the
expression of a given gene was higher than the average for that gene across all
48 trees. Genes differentially expressed between the powdery mildew disease
resistance phenotype groups are indicated by sequence ID numbers. Gene
expression markers with segregation expression patterns that did not correlate
with powdery mildew disease resistance are denoted by diamonds. Chr,
chromosome. Color scale: fold change in transcript abundance. Gene labels shown
in the figure: APPLE0F000001606, APPLE0FR00048809, APPLE0F000025011,
APPLE0F000002331, APPLE0F000052120, APPLE0F000021822, APPLE0F000001330,
APPLE0F000026140, APPLE0F000027353, APPLE0F000002620.



only 48 F1 individuals. Analysis of larger numbers of individuals would
certainly be required in order to analyze multi-locus traits effectively.

The clustered differentially expressed genes are not necessarily involved in
controlling their associated phenotype. For example, the differentially
expressed genes associated with PM resistance did not show any obvious
functional patterns or similarities (Table 1). It is also important to note
that the differentially expressed genes we examined were not selected based on
their induced expression during pathogen or insect interaction. Rather, the
differentially expressed genes represented transcripts whose steady-state
expression levels in healthy tissue were associated with PM or WAA resistance
phenotype status. It is possible that examining differential gene expression
using infected or infested samples might mask the clustering due to the large
numbers of genes being up- and down-regulated in response to the stress.

The clustering pattern of differentially expressed genes is consistent with the
relative expression levels of these genes being inherited from a parent. This
is different from genome neighborhood effects, where groups of linked genes are
typically up- or down-regulated together [15]. Just as DNA polymorphism markers
can be linked to a trait locus, expression patterns of some of the nearby genes
are also linked. By grouping trees according to an inherited trait of interest,
one might expect that the differentially expressed genes would be identified
simply due to their decreased expression variation within a particular pool
compared to non-correlating genes at loci unlinked to the trait of interest.
However, it is remarkable that the differential expression patterns between the
phenotypic groups included so few genes, and that so many of these were
physically clustered. This suggests that heritable differences in gene relative
expression were predominantly detected by the analysis, rather than genes whose
expression levels might contribute to or be necessary for the development of
the phenotype. Such genes would be expected to be scattered randomly across the
genome. Our results are similar to those seen in Brassica napus using single
nucleotide polymorphisms (SNPs) and gene expression markers [11]. However, the
present study used a segregating apple tree F1 population, while the other
study used a collection of accessions.

Validation of the expression patterns using qPCR indicated that most of the
genes indeed had patterns of expression consistent with the array data. The
congruence of DNA microarray and qPCR data for selected differentially
expressed genes and gene expression markers provided strong validation for the
DNA microarray data. qPCR validation was successful using a different set of
individuals from the same cross in a different environment and year from those
used for the microarray, indicating that the differentially expressed genes had
relative expression levels consistent across different growing conditions and
years and between different groups of individuals from the O3 × R5 F1
population. The development of PCR-based molecular markers associated with
several of the differentially expressed sequences was in many cases successful
because sequence mutations such as large INDELs and microsatellite variation
were discovered within or near the target genes. While several methods are
available to detect polymorphisms in marker-assisted breeding, markers based on
the polymerase chain reaction are still the most accessible and least expensive
for small-scale breeding programs. The combination of expression analysis for
target identification and sequence-based marker development proved a good
strategy, as the PCR markers developed in this study have routinely proven
useful in the apple rootstock breeding program in Geneva, NY. As RNAseq methods
become more refined, it may be possible to find differentially expressed genes
associated with traits of interest and at the same time leverage polymorphisms
contained in the expressed sequences for haplotype-specific breeding marker
development.

Conclusion 

We have shown in a segregating population from a cross of highly heterozygous
parents that gene expression analysis can result in identification of
differentially expressed genes that are physically linked to one another. Gene
expression heritability as a method to detect genes physically linked to trait
loci of interest could be useful in crops for which few genetic and genomic
information resources are available. Even in the absence of a genetic map,
molecular markers linked to traits of interest could be developed using
transcriptome profiles of segregating populations, since a substantial
proportion of the differentially expressed genes would be expected to cluster
in the vicinity of the trait of interest. While there may be no causal link
between the differentially expressed genes and their associated traits, they do
provide an excellent starting point for development of DNA markers linked to
segregating traits of interest.

Methods 

Plant materials 

The trees used for the DNA microarray analysis were from a segregating F1
population from an O3 × R5 cross and were grown in an orchard in Geneva, NY
[16]. Samples were taken in early summer of 2009 from healthy, uninfected,
uninfested individual shoots from first-year growth of 48 plants in a
propagation stool bed. Shoot tip samples comprised all shoot tissues up to and
including the first fully expanded leaf. Sampled shoots were carefully selected
so that they were as similar to each other as possible in size and shape to
minimize sampling variation. The samples were flash frozen in liquid nitrogen
and stored at -80°C for later RNA isolation.

For the qPCR analyses, shoot tip samples were collected in late spring of 2013
from a separate group of 46 individuals belonging to the same O3 × R5
population from clonally propagated material in a replicate orchard in Geneva,
NY. Shoot tips from the population parents (O3 and R5) were collected from
trees at the apple collection of the USDA ARS Plant Genetic Resources Unit in
Geneva, NY.

RNA isolation and microarray analysis 

Total RNA was isolated from whole apple shoot tips as previously described
[17]. The microarray data used in the present study were generated during a
previously reported study [18] and subjected to a new analysis. The contig
sequences used for array probe development are accessible at the Gene
Expression Omnibus (GEO) dataset website [19]. The array used was a
second-generation array in a 12-plex array format containing 135,000 probes per
plex, representing 26,017 transcripts, enabling us to query a relatively large
number of samples. Each transcript was queried by 4-5 probes of 60-70 bases in
length. The array included the best-performing probes from the first-generation
array and was enriched for differentially expressed genes based on analyses of
the first-generation array [17]. The genes predicted to encode the 26,017
transcripts represented on the array were evenly and randomly distributed
across the apple genome. The expression levels for each individual F1 tree were
analyzed using a single array only; however, analyses were conducted using
pooled data from trees with similar phenotypes, which represented
pseudo-replicates in this context [20].

While the parents of the breeding population were different from the varieties
used to design the DNA microarray, this did not interfere with probe
performance or account for the observed expression patterns. Any nucleotide
polymorphisms between the probes and the samples, when present, did not affect
hybridization, as the other probes for the same transcripts, which had no
polymorphisms, gave similar intensity values (Additional file 1: Table S4). In
addition, mismatches between probes and samples did not correlate with
variation between probe signals (Additional file 1: Table S4). For example, the
probes for APPLE0FR00039157 had 1 or 5 mismatches to their target per probe,
yet they produced data with similar signal intensity and standard deviations to
the APPLE0FR00031686 probe set, which had no mismatches to its target
(Additional file 1: Table S4).

Differential gene expression analysis 

The gene expression data from the DNA microarray hybridization experiments were
previously normalized using R software [18]. To identify differentially
expressed genes based on PM resistance phenotype, for example, the trees were
divided into two groups, one group containing the PM-resistant trees, and the
other group containing the PM-susceptible trees. The mean of the log2
(expression) value for each transcript was then calculated separately in each
phenotypic group, and then the M-value (log2 fold difference in expression) for
each transcript was computed as the difference in the mean log2 (expression)
value for each transcript between the two groups of trees. Empirical Bayes
ANOVA analysis [21] was performed using the LIMMA (Linear Models for Microarray
Data) package [22] as part of the R Bioconductor suite [23]. P-values (false
positive rate) from this analysis were then converted to q-values (false
discovery rate) using the methodology of Storey and Tibshirani [11] as
implemented in the Bioconductor q-value routine [24]. Differentially expressed
transcripts were identified as transcripts with statistically different levels
of expression between the phenotypic groups (q-value < 0.05), regardless of the
magnitude of the difference (M-value). Differentially expressed genes for WAA
and the several GEM traits were identified using the same approach, using
phenotypic groups defined by each trait.
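
The M-value and the p-value to q-value conversion can be sketched in a few lines. This Python illustration is not the LIMMA or Bioconductor q-value code: it substitutes the Benjamini-Hochberg adjustment, which corresponds to a Storey-type q-value with the proportion of true nulls fixed at 1, and it takes p-values as given instead of deriving them from an empirical Bayes model. Function names and input values are hypothetical.

```python
from statistics import mean


def m_value(log2_resistant, log2_susceptible):
    """M-value: mean log2 expression in resistant minus susceptible trees."""
    return mean(log2_resistant) - mean(log2_susceptible)


def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Stand-in for the Storey-Tibshirani q-value (equivalent when pi0 = 1).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical per-transcript p-values.
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = bh_q_values(p_values)
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print(q_values, significant)
```

Selecting on q-value alone, as in the last line, mirrors the rule of ignoring the magnitude of the M-value.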

Identification of gene expression markers 

The first step was to calculate the mean of the log2 (expression) value for
each transcript in the entire F1 population. Next, the divergence in expression
for each individual tree compared to the average was determined for each
transcript. GEMs were identified as transcripts that could divide the trees
into groups of roughly equal size based on having at least a 1.5-fold
difference in expression levels between the groups (Additional file 2:
Figure S1).
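
One possible reading of this rule is sketched below: trees are split at the population mean, and a transcript counts as a GEM when the two groups differ by at least 1.5-fold and neither group is too small. The 30% minimum group fraction used for "roughly equal size" is an assumption introduced here, as are the function name and the example values.

```python
import math
from statistics import mean


def is_gem(log2_expr, min_fold=1.5, min_fraction=0.3):
    """Flag a transcript as a candidate gene expression marker (GEM).

    log2_expr    : log2 expression values, one per tree
    min_fold     : required fold difference between group means (from text)
    min_fraction : smallest allowed group share (assumed, not from text)
    """
    mu = mean(log2_expr)
    high = [x for x in log2_expr if x > mu]
    low = [x for x in log2_expr if x <= mu]
    if not high or not low:
        return False
    balanced = min(len(high), len(low)) / len(log2_expr) >= min_fraction
    separated = mean(high) - mean(low) >= math.log2(min_fold)
    return balanced and separated


# Hypothetical transcripts: bimodal, flat, and driven by a single outlier.
print(is_gem([5.0] * 4 + [7.0] * 4))
print(is_gem([6.0, 6.1, 5.9, 6.0, 6.05, 5.95]))
print(is_gem([5.0] * 7 + [9.0]))
```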

Expression pattern validation by qPCR 

Shoot tip samples were processed fresh, immediately after collection, using the
ZR Plant RNA MiniPrep kit (Zymo Research, Irvine, CA, USA) following the
manufacturer's instructions, with the addition of DNase I (Invitrogen, Grand
Island, NY, USA) to the RNA wash buffer in the kit as recommended in the
manufacturer's protocol. cDNA synthesis was performed using the QuantiTect
Reverse Transcription kit (Qiagen, Germantown, MD, USA) according to the
manufacturer's instructions. PCR primers for selected differentially expressed
genes and gene expression markers were optimized using parental genomic DNA
with Annealing Temperature Gradient PCR (ATG-PCR; Additional file 1: Table S5).
qPCR was performed in 25 µL reactions using LightCycler 480 SYBR Green I Master
reaction mix (Roche, Indianapolis, IN, USA) and a LightCycler 480 instrument
(Roche). A Basic Relative Quantification workflow was used that included
fluorescence measurement at each PCR extension step and a final melting step
measuring fluorescence from 95-55°C for Melt Curve Genotyping. An actin gene
(Genbank accession number EE 136338, primers: 5'-GGGTGGATTTGGTGGTGATG-3' and
5'-TGGTGAGTATGGGGTGGTGA-3') was used as the reference for Relative
Quantification Analysis (RQA). The Crossing Point (Cp) values and ratios
between target and references were calculated using the LC480 software and
algorithms (Roche). Melt curves were analyzed for non-specific amplification
peaks (Additional file 2: Figure S2). Amplicons were resolved on 1.5% ethidium
bromide (EtBr) stained agarose gels and visualized with an AlphaImager HP gel
documentation system (ProteinSimple, Santa Clara, CA, USA).
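
For orientation, a target-to-reference ratio can be derived from crossing-point values as sketched below. This is a generic calculation, not the instrument software's algorithm, and it assumes the same amplification efficiency for target and reference (2.0, meaning perfect doubling per cycle); the Cp values shown are hypothetical.

```python
def relative_expression(cp_target, cp_reference, efficiency=2.0):
    """Target/reference ratio from crossing-point (Cp) values.

    Assumes equal amplification efficiency for both amplicons; a lower Cp
    means more starting template.
    """
    return efficiency ** (cp_reference - cp_target)


# Hypothetical Cp values for one target gene in two trees (actin as reference).
tree_a = relative_expression(cp_target=22.0, cp_reference=20.0)
tree_b = relative_expression(cp_target=25.0, cp_reference=20.0)
print(tree_a, tree_b, tree_a / tree_b)
```

Dividing the ratios of two trees, as in the last line, gives the fold difference in normalized expression between them.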



DNA marker development 

The sequence of gene APPLE0FR00068101, whose expression was associated with WAA
resistance, was compared to the 'Golden Delicious' apple genome hosted by the
Genome Database for Rosaceae [14] using the National Center for Biotechnology
Information (NCBI)'s BLASTN program [12] to identify four contigs
(MDC015568.236, MDC013761.427, MDC015568.269, MDC000748.724) containing similar
sequences (e-values between 1E-97 and 2E-89). All four contigs were located on
apple chromosome 17 within a 74 kb interval (genome base pair positions
1,405,743-1,479,871; Additional file 2: Figure S5). Genomic sequences for
parents O3 and R5 had been obtained by Next-Gen Illumina Hi-Seq paired-end
sequencing. Geneious bioinformatic software (Biomatters, San Francisco, CA,
USA) was used to construct a local alignment of next-gen sequences to the
Malus × domestica contig containing the complete predicted target gene sequence
(MDP). Contig MDC015568.236 contained sequences most similar to the R5 next-gen
sequences, and was therefore chosen for further analysis. Unique single
nucleotide polymorphisms (SNPs), simple sequence repeats (SSRs), and haplotypes
were identified for the R5 (WAA-resistant) parent. Several 18-21 bp PCR primers
were designed to match unique SNP haplotypes at the 3' end, and primer pairs
were tested with parental DNAs using annealing temperature gradient PCR (45°C
to 65°C) to verify genotype specificity, stability and reproducibility of
amplicons. Amplicons were resolved on 2% EtBr-stained agarose gels and
visualized with an AlphaImager HP gel documentation system (ProteinSimple). In
addition, some primers were designed flanking microsatellite SSRs.
Genotype-specific amplicons were then tested on segregating individuals in the
O3 × R5 population and a diversity panel of apple rootstocks to verify genetic
inheritance, linkage to other markers, and haplotype uniqueness.

Availability of supporting data 

Microarray data are available through the GEO website 
using accession number GSE43268. 

Additional files 



Additional file 1: Table S1. Transcripts identified as having expression
patterns associated with powdery mildew resistance, woolly apple aphid
resistance, and the gene expression marker traits analyzed in a segregating F1
population from an 'Ottawa 3' × 'Robusta 5' cross. Table S2. Correlation
between Microarray Gene Expression Values for WAA resistance differentially
expressed gene APPLE0FR00068101, and associated features APPLE00R00024612,
APPLE0F000011491, APPLE0F000050102. Table S3. Correlation between Relative Gene
Expression values (Target Gene/Actin Reference Gene) in qPCR Basic Relative
Quantification outputs for the following targets: APPLE24612, APPLE50102,
APPLE11491 and WAA resistance differentially expressed gene APPLE68101.
Table S4. Individual probe data and SNP counts for transcripts in the region
29.0 to 30.2 Mb on Chromosome 12. Table S5. qPCR conditions and primers for
gene expression validation.

Additional file 2: Figure S1. Example gene expression marker (GEM) 
trait. Figure S2. Relative quantification results for qPCR of differentially 
expressed gene APPLE0FR00048809 (associated with PM resistance) 
relative to actin. Figure S3. Visualization of qPCR amplicons of gene 
APPLE0F000001977, showing clear segregation (presence/absence) of 
amplified target cDNA in selected progeny. Figure S4. Annealing 
temperature gradient amplification (65°C - 45°C) of differentially 
expressed gene APPLE0FR00068101 derived markers on parental DNAs. 
Figure S5. Alignment of microarray feature APPLE0FR00068101 to 
Chromosome 17 of the apple genome. 